Individualization of Intensity Thresholds on External Workload Demands in Women’s Basketball by K-Means Clustering: Differences Based on the Competitive Level

In previous studies, speed (SP), acceleration (ACC), deceleration (DEC), and impact (IMP) zones have been defined using arbitrary thresholds that do not consider the specific workload profile of the players (e.g., sex, competitive level, sport discipline). Statistical methods based on raw data offer an alternative for individualizing these thresholds. The purposes of this study were to: (a) individualize SP, ACC, DEC, and IMP zones in two female professional basketball teams; (b) characterize the external workload profile of 5 vs. 5 during training sessions; and (c) compare the external workload according to the competitive level (first vs. second division). Two basketball teams were recorded during a 15-day preseason microcycle using inertial devices with ultra-wideband indoor tracking technology and microsensors. The zones of external workload variables (speed, acceleration, deceleration, impacts) were categorized through k-means clusters. Competitive level differences were analyzed with the Mann–Whitney U test and Cohen’s d effect size. Five zones were categorized in speed (<2.31, 2.31–5.33, 5.34–9.32, 9.33–13.12, 13.13–17.08 km/h), acceleration (<0.50, 0.50–1.60, 1.61–2.87, 2.88–4.25, 4.26–6.71 m/s²), deceleration (<0.37, 0.37–1.13, 1.14–2.07, 2.08–3.23, 3.24–4.77 m/s²), and impacts (<1, 1–2.99, 3–4.99, 5–6.99, 7–10 g). The women’s basketball players covered 60–51 m/min, performed 27–25 ACC-DEC/min, and experienced 134–120 IMP/min. Differences were found between the first and second division teams, with higher values in SP, ACC, DEC, and IMP in the first division team (p < 0.03; d = 0.21–0.56). In conclusion, k-means clustering can be considered an optimal tool to categorize intensity zones in team sports. The individualization of external workload demands according to the competitive level is fundamental for designing training plans that optimize sports performance and reduce injury risk in sport.


Introduction
The training process in team sports is based on a rigorous and methodical process for designing the external and internal workload administered to the players [1]. The management of training workload is essential to optimize sports performance, reduce injury risk, and avoid overtraining, as well as to monitor the evolution of the players' physical level throughout their sport career [2]. For this purpose, the recording of internal and external workload is fundamental for obtaining objective data on the training process in team sports, and specifically in basketball [3].
Internal and external workload in basketball can be quantified using different methods depending on the available resources [4]. For time-motion analysis, the first method used was video tracking [5], but because of its high cost and difficult data processing, it has largely been replaced in indoor conditions by radiofrequency technologies that use antennas as a reference system, such as ultra-wideband (UWB) or Bluetooth [6]. These data can be complemented with microsensors to record how the players' movements influence the workload supported by the musculoskeletal structures of the body [7]. Regarding internal workload, the most widespread method is heart rate telemetry, although different hematological markers such as blood lactate, cortisol, insulin, and glucose have also been measured [4].
The monitoring of workload will allow team staff to ascertain the mechanical and locomotor stress suffered during efforts as well as the biological reaction of the player's body [8]. Following the principles of training, the dose-response is individually based on physiological (e.g., age, physical fitness), psychological (e.g., perceived effort, motivation), environmental (e.g., sport discipline, competitive level, period of the season), and genetic (e.g., sex) factors [9]. Thus, the thresholds of intensity zones related to speed, acceleration, deceleration, and impacts suffered by the players should be adapted to the individual characteristics of the athletes [10].
For this purpose, recent studies have proposed different methods to calculate intensity thresholds: (a) based on maximum values reported in the literature combined with the researchers' own competition and training data [18], (b) using Gaussian distributions with unknown parameters [19], (c) applying the k-means clustering algorithm [10], or (d) through a spectral clustering algorithm [20]. In basketball, only one previous study has used a k-means clustering algorithm to individualize speed zones in women youth players (standing: <3.6 km/h, walking: 3.6-6.5 km/h, jogging: 6.5-10.2 km/h, running: 10.2-14.4 km/h, sprinting: >14.4 km/h) [21], and no study has yet individualized intensity thresholds for accelerations, decelerations, and impacts.
Therefore, due to the importance of individualized thresholds to identify the specific values based on sport discipline, players' level, or sex [22], as well as the lack of research about the individualization of speed, acceleration, deceleration and impact zones in basketball, and specifically in women's basketball, the purposes of the present study were: (a) to conduct an individualization of the work zones of each variable in two professional women's basketball teams, using the k-means cluster algorithm to ascertain the different training intensities in SP, ACC, DEC and IMP; (b) to characterize the external workload demands of the selected variables related to the 5 vs. 5 profile in the total and normalized variables, and (c) to compare the external workload in a 5 vs. 5 game situation in the training sessions according to the competitive level (first vs. second division).

Design
This research is an empirical study that follows an associative strategy with a cross-sectional comparative design [23]. It explores the thresholds of intensity zones in external workload, characterizes the performance of women's basketball players in training games (5 vs. 5), and examines the differences between teams according to the competitive level (first vs. second division).

Participants
Twenty-two professional women's basketball players who belonged to two elite-level teams (first division, Liga Femenina 1; second division, Liga Femenina 2) participated in the present study (first division, n = 10, age = 22.51 ± 2.68 years, height = 1.81 ± 0.08 m, body mass = 75.58 ± 12.32 kg; second division, n = 12, age = 21.79 ± 2.45 years, height = 1.77 ± 0.09 m, body mass = 78.32 ± 11.55 kg). They were evaluated during 15 days of the preseason period, in which they performed 10 training sessions. The players who took part in the present study met the following inclusion criteria: (a) they had participated in all training sessions during the 15-day microcycle and in all tasks in each session; (b) they had not presented musculoskeletal injuries or health problems in the previous two months; (c) they had been familiarized with high-level monitoring during more than 10 training sessions or competitive games; and (d) they had played at the maximum competitive level in any country for at least 2 years before the study.
The players, coaches and managers of the teams were informed before the investigation about the possible risks and benefits of participation. An informed consent form was signed by the coaching staff, managers, and basketball players of each team. The research was carried out under the criteria of the Declaration of Helsinki (2013) and was approved by the Bioethics committee of the University (233/2019).

Variables
For this research, the competitive level of the teams (first women's division vs. second women's division) was considered as the independent variable. To achieve the proposed objectives, the following dependent variables, which are widely used in basketball [11,12,24], were chosen: speed (SP), acceleration (ACC), deceleration (DEC), and impacts (IMP). All the variables were grouped into five work zones by the k-means clustering algorithm.

Equipment
Data were recorded with WIMU PRO™ inertial devices (RealTrack Systems, Almeria, Spain). Distance covered at different speeds, accelerations and decelerations were obtained through ultra-wideband (UWB) tracking technology at 33 Hz. This UWB system was designed to replace the satellite reference in indoor conditions [25] and consisted of a transmitter reference system (antennas) and receivers (devices). The reference system is composed of eight antennas placed in the corners (n = 4, 5 m from the perimeter), on the middle line (n = 2, 7 m from the perimeter) and behind the baskets (n = 2, 7 m from the perimeter), forming an octagon and positioned at a height of 3 m. Switch-on and calibration processes were performed following the manufacturer's recommendations, and the system has shown almost perfect validity and reliability [26]. Regarding impacts, the inertial device was composed of different microsensors (four accelerometers: 2× ±16 g, 1× ±32 g and 1× ±400 g; three gyroscopes: 2000 °/s; one magnetometer) that were set at 100 Hz and have shown almost perfect validity in accelerometer raw data [27]. Devices were located at the inter-scapular level in each player with an anatomical harness. The registered data were analyzed through the SPRO™ software (RealTrack Systems, Almeria, Spain).

Procedures
First, the clubs were contacted to inform them about the study purposes and to invite them to participate. Once the proposal was accepted, an informed consent form was signed by coaches and players. The teams performed five training sessions on the court and one competitive game each week. During training sessions, the women's basketball players were monitored with the inertial devices. The antenna system was first installed around the court, and the devices were then placed, 30 min prior to the session, in a neoprene anatomical vest at the scapular level. During sessions, the specialized S VIVO™ software was used for the time selection of each task. After the end of each session: (1) data were downloaded to a laptop, (2) data were imported into the manufacturer's SPRO™ software to export external workload variables, (3) external workload variables were uploaded to the WIMU cloud storage, and (4) the report of the session was generated and an informative dossier detailing the relevant information of the session was sent daily to the team staff. When the analysis and team report were completed, all the tasks performed by both teams involving five vs. five game situations on the full court were selected. Although monitoring covered all training sessions, only the five vs. five situations were analyzed in this research. The duration of the task could vary depending on different aspects (first division team ≈ 8 min, second division team ≈ 6 min). When the players took a break (to rest or perform another task, for example free throw shots), the task ended, and if it was resumed after the break, it was analyzed as a separate task (so several data collections could be obtained in the same training session).
The raw data from each of the training tasks were analyzed using the k-means clustering algorithm. The results of this algorithm were used to configure the SPRO™ software with the specific ranges in which the external load variables of the research were classified.
Finally, the results were obtained for each of the research variables classified into the five groups defined in the k-means cluster specific to the population of professional women's basketball players, first division, second division and combined. These results have served to identify the differences in the external load in players at two different competitive levels.
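As an illustration of how raw speed samples can be classified into the five zones, the following minimal Python sketch uses the speed thresholds reported in the abstract and the 33 Hz UWB sampling rate. It is not the SPRO™ implementation: the function names are hypothetical, and integrating distance as speed multiplied by a constant sampling interval is an assumption made for the example.

```python
from bisect import bisect_right

# Lower bounds (km/h) of speed zones 2 to 5, taken from the zones in the abstract.
SPEED_ZONE_LOWER_BOUNDS = [2.31, 5.34, 9.33, 13.13]
SAMPLE_RATE_HZ = 33  # UWB sampling frequency reported in the Equipment section


def zone_index(value, lower_bounds):
    """Return the 0-based zone index of a value given sorted lower bounds."""
    return bisect_right(lower_bounds, value)


def distance_per_zone(speeds_kmh, lower_bounds=SPEED_ZONE_LOWER_BOUNDS,
                      sample_rate_hz=SAMPLE_RATE_HZ):
    """Accumulate distance (m) per zone from a constant-rate speed trace.

    Assumption: each sample contributes speed (m/s) times the sampling interval.
    """
    dt = 1.0 / sample_rate_hz
    totals = [0.0] * (len(lower_bounds) + 1)
    for speed in speeds_kmh:
        totals[zone_index(speed, lower_bounds)] += (speed / 3.6) * dt
    return totals
```

For example, one second of samples at a constant 3.6 km/h (1 m/s) adds about 1 m to the second zone (2.31–5.33 km/h).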

Statistical Analysis
Firstly, raw data of the total acceleration (AcelT, vector sum of acceleration in the three planes of movement, g force) and UWB speed (km/h) channels, as well as the acceleration and deceleration values (m/s²) of each positive and negative change of speed generated by all players during sessions, were imported into the statistical package. Three analyses with the k-means clustering algorithm, based on five zones following previous basketball research [21], were conducted: (1) first division team, (2) second division team, and (3) total team data. In addition, the results pertaining to the quality of the k-means clustering are shown.
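To make the clustering step concrete, the sketch below runs k-means (Lloyd's algorithm) on a single variable and reads the zone limits from the minimum and maximum value of each cluster. It is an illustration only, not the statistical package procedure used in the study; the function names and the quantile-based initialization are assumptions chosen to keep the example deterministic.

```python
def kmeans_1d(values, k=5, max_iter=100):
    """Lloyd's algorithm on one variable; returns (centroids, clusters)."""
    data = sorted(values)
    n = len(data)
    # Assumed deterministic start: centroids at evenly spaced quantiles.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)
        # Update step: each centroid becomes the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:  # assignments are stable, so stop
            break
        centroids = updated
    return centroids, clusters


def zone_bounds(clusters):
    """Return the (lower, upper) limit of each non-empty cluster."""
    return [(min(c), max(c)) for c in clusters if c]
```

Under these assumptions, the function would be applied separately to each variable (speed, acceleration, deceleration, impacts) and to each data set (first division, second division, combined).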
Then, distance covered in each speed zone, number of accelerations and decelerations in each intensity zone, and number of impacts at each intensity were exported as total (accumulated value in each task) and relative (accumulated value in each task divided by the total time in minutes) variables to characterize the volume and intensity of tasks, respectively. Data normality and homoscedasticity were explored with the Kolmogorov-Smirnov and Levene tests, which showed a non-normal distribution. For this reason, external workload variables were characterized in the descriptive analysis as median and range (upper and lower values) and in plots as histograms.
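The relative variables and the descriptive statistics described above can be sketched as follows. The function names are hypothetical, and the input values in the usage note are invented for illustration; they are not study data.

```python
from statistics import median


def per_minute(total, duration_min):
    """Relative (intensity) value: accumulated total divided by task minutes."""
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    return total / duration_min


def describe(values):
    """Median and range (lower and upper values), as used for non-normal data."""
    return {"median": median(values), "min": min(values), "max": max(values)}
```

For instance, a hypothetical 8-min task in which a player covers 480 m gives `per_minute(480, 8)`, that is, 60 m/min.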
Finally, the Mann-Whitney U test was conducted to analyze the effect of competitive level on total and relative variables. The effect size of differences was obtained with Cohen's d and interpreted as follows: d < 0.2 as trivial, d = 0.2-0.5 as small, d = 0.5-0.8 as moderate, and d > 0.8 as large [28]. Statistical differences were considered if p < 0.05. Data analysis was performed using the Statistical Package for the Social Sciences (SPSS, IBM, SPSS Statistics, v.25.0 Armonk, NY, USA) and graphs were made using Prism software (GraphPad Software, San Diego, CA, USA).

Results
Table 1 presents the results of the k-means cluster analysis in five groups, following the previous studies referenced above. Table 2 shows the results pertaining to the quality analysis of the k-means clustering carried out in the research. Figures 1 and 2 show the histograms of the variables related to distance and impacts, and to accelerations and decelerations, respectively. The results are similar in both teams analyzed, although the first division team presents a greater number of high intensity actions than the second division team.
Table 3 shows the external workload variables according to the volume of demands in the women's basketball teams. Significant differences were observed according to competitive level (first and second division) in all variables analyzed except the highest intensity zone in accelerations and decelerations (p > 0.31). The effect size obtained is also high in all the variables except for those that do not show significant differences. Table 4 shows the results of the external load variables according to the intensity of the demands.
Statistical differences were found between competitive level in distance covered at jogging and running, in accelerations at total, low and moderate intensity, in decelerations at total, low, moderate and high intensity, and in impacts at all intensities with higher values in the first division team. Furthermore, the effect size shows high values in decelerations (d = 0.57-0.84) and low values in the rest of variables (d < 0.46).
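The comparison between competitive levels described in the Statistical Analysis section (Mann-Whitney U statistic and Cohen's d with its interpretation) can be sketched in plain Python as follows. This is a minimal illustration, not the SPSS procedure used in the study: it omits the p-value, it uses the pooled standard deviation for d, and it assigns the boundary values 0.5 and 0.8 to the higher category, all of which are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def interpret_d(d):
    """Label an effect size using the thresholds cited in the text [28]."""
    size = abs(d)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"
```

In practice, `a` and `b` would hold the per-task values of one variable for the first and second division teams, respectively.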

Discussion
The control and quantification of the loads that players support during training or competition has been a growing topic in recent years. However, research is notably scarcer when the sport is basketball and the sample comprises women players [24]. Technology such as inertial devices has facilitated this work and has been adopted in both the scientific and professional fields of training in search of common lines of investigation [7]. However, in most cases, the variables used to quantify the demands that an athlete supports are not optimal for the selected population [29,30]. Part of this problem is the generic use of ranges for these variables (volume or intensity) without individualizing them to the sample [31], with serious consequences for the planning and results of a team during competition. Therefore, the use of k-means clustering helps to individualize the results of the analyzed players and to establish thresholds optimally, as in previous research [21,32], following one of the main principles of training.
Reviewing the literature, there are different investigations that use cluster analysis [11]. However, it is not a common practice due to the need for the previous data that are required for the analysis [21,32]. According to the results, differences were obtained in this research because of the individualization of the competitive process (sex, competitive level or characteristics of the sample) [11]. All aspects have an impact on the demands that players support, and therefore, on the workload demands performed during the game [12]. In contrast to using k-means clusters, the vast majority of investigations adjust their ranges of variables according to other investigations (without considering the individual characteristics of the sample) or with the values provided by the manufacturer [29]. This is an error that eliminates the individualization of the training process, which is one of the main objectives of the quantification and control of the competition and training workloads.
Regarding the demands that the analyzed players supported, differences are observed in the recorded variables depending on the competitive level, both in their volume and intensity. The studies that use k-means clustering to individualize speed thresholds have been carried out in different contexts (formative players, PE lessons, youth national-level women players) from the present study (five vs. five in full-court without breaks). González-Espinosa et al. [32] evaluated under-12 basketball players during four vs. four games in school competition, finding four work zones (<6 km/h; 6-12 km/h; 12-18 km/h; >18 km/h). Similarly, Gamero et al. [33] established four work zones through cluster analysis as follows (<5.2 km/h; 5.2-10.5 km/h; 10.5-15.7 km/h; >15.7 km/h). Along the same lines, García-Ceberino et al. [34] employed the k-means clustering method with five levels to differentiate PL/min load ranges in the context of PE, as load ranges should be adapted to the study population. Finally, Reina et al. [21] performed a cluster analysis during a competition in women players that were in their last formative stage (youth, close to amateur age) and the values were grouped into five categories (<3.6 km/h; 3.6-6.5 km/h; 6.5-10.2 km/h; 10.2-14.4 km/h; >14.4 km/h). The differences between previous research and the present study could be related to the multitude of cases collected in different training sessions in the same sample, as well as the specific characteristics of the players and playing context (e.g., minutes played, competitive level of the rival, players' ages, players' sex). These differences confirm the importance of individualization of the process and variability depending on the selected sample.
Finally, the main strength of this research is that it is the first to use k-means clustering to obtain reference values and individualize the training processes in first and second division women's basketball players in five vs. five training tasks during the preseason period. In addition, the results provide information on the workload that the players supported in response to the same stimulus (training game), in which the demands differ depending on the competitive level. Therefore, this research provides relevant and specific information for basketball team staff and sport scientists on a topic that may grow in the coming years due to the exponential growth of Big Data and the continuous search for the individualization of workload based on players' characteristics and contextual aspects of the analyzed teams. The initial hypothesis of the research was that five vs. five situations during training try to resemble competition situations, although this is not always achieved. Accordingly, players of a high sport level, whether first or second division, showed similar responses during training: the differences in their behaviors were significant but not very large.

Conclusions
From the results obtained in the present study, which carried out a first approach to individualize work zones in first and second division women's basketball players through a k-means cluster algorithm, different conclusions and practical applications can be drawn:

1. The characterization of women's basketball demands is important to quantify specific training and competition workloads adapted to the physical fitness of players and sport discipline. The use of mathematical processes like k-means clustering helps the establishment of thresholds in an objective way.

2. Differences in speed, changes of speed and impact thresholds were found between categories. First division players obtained higher thresholds in all types of movements and at all intensities, especially in changes of speed. Therefore, the competitive level is one of the aspects to consider when work zones are individualized.

3. The competitive level also affected the duration of five vs. five full-court training tasks. The first division team performed five vs. five full-court tasks with longer duration (~8 min, 7′30″ to 8′40″) than the second division team (~6 min). For this reason, a higher volume of distance, accelerations, decelerations and impacts at all intensities was found in the first division team. The adaptation of task duration is fundamental for achieving the desired training objectives. A lower competitive level is normally linked to lower physical fitness so that shorter tasks are necessary to maintain high intensity with longer between-repetition breaks.

4. When variables were relativized to task duration, the first division team players covered a greater distance jogging and running, recorded higher positive and negative changes of speed and impacts at all intensities except in the very high zone. In this respect, the competitive level not only affected the workload volume in relation to task duration, but the movements were performed at higher intensity in the first division players. Therefore, an adaptation of the training, both in volume and intensity, is necessary to achieve the desired performance enhancement and reduce injury risk.
The main limitations of the research focus on the analysis of two teams in five vs. five situations in training sessions and not during competition. In addition, the analysis was carried out during the pre-season period. In order to improve the results obtained, it would be interesting for future research to make a comparison between five vs. five in competition and training sessions, as well as at different times of the season.
Funding: The author Carlos D. Gómez Carmona was supported by a grant from the Spanish Ministry of Education, Culture, and Sport (FPU17/00407). This study was co-funded by the Regional Department of Economy and Infrastructure of the Government of Extremadura (Spain) through the European Funds of Regional Development of the European Union (dossier number: GR21149). This study was co-funded by the Spanish National Agency of Investigation through the project "Scientific and Technological Support to analyze the Training Workload of Basketball teams according to sex, level of the players and season period" (PID2019-106614GB-I00).
Institutional Review Board Statement: Club managers, technical staff and players were previously informed about the investigation details and signed informed consent forms. The study was performed based on the ethical guidelines of the Declaration of Helsinki (2013) and approved by the Bioethics Committee of the University (registration number 232/2019).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the Organic Law 3/2018, of 5 December, on the Protection of Personal Data and Guarantee of Digital Rights of the Government of Spain, which requires that this information must be in custody.